{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<small><i>This notebook was put together by [Jake Vanderplas](http://www.vanderplas.com). Source and license info is on [GitHub](https://github.com/jakevdp/sklearn_tutorial/).</i></small>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# An Introduction to scikit-learn: Machine Learning in Python"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Goals of this Tutorial"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- **Introduce the basics of Machine Learning**, and some skills useful in practice.\n",
    "- **Introduce the syntax of scikit-learn**, so that you can make use of the rich toolset available."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Schedule:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Preliminaries: Setup & introduction** (15 min)\n",
    "* Making sure your computer is set-up\n",
    "\n",
    "**Basic Principles of Machine Learning and the Scikit-learn Interface** (45 min)\n",
    "* What is Machine Learning?\n",
    "* Machine learning data layout\n",
    "* Supervised Learning\n",
    "    - Classification\n",
    "    - Regression\n",
    "    - Measuring performance\n",
    "* Unsupervised Learning\n",
    "    - Clustering\n",
    "    - Dimensionality Reduction\n",
    "    - Density Estimation\n",
    "* Evaluation of Learning Models\n",
    "* Choosing the right algorithm for your dataset\n",
    "\n",
    "**Supervised learning in-depth** (1 hr)\n",
    "* Support Vector Machines\n",
    "* Decision Trees and Random Forests\n",
    "\n",
    "**Unsupervised learning in-depth** (1 hr)\n",
    "* Principal Component Analysis\n",
    "* K-means Clustering\n",
    "* Gaussian Mixture Models\n",
    "\n",
    "**Model Validation** (1 hr)\n",
    "* Validation and Cross-validation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Preliminaries"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This tutorial requires the following packages:\n",
    "\n",
    "- Python version 2.7 or 3.4+\n",
    "- `numpy` version 1.8 or later: http://www.numpy.org/\n",
    "- `scipy` version 0.15 or later: http://www.scipy.org/\n",
    "- `matplotlib` version 1.3 or later: http://matplotlib.org/\n",
    "- `scikit-learn` version 0.15 or later: http://scikit-learn.org\n",
    "- `ipython`/`jupyter` version 3.0 or later, with notebook support: http://ipython.org\n",
    "- `seaborn`: version 0.5 or later, used mainly for plot styling\n",
    "\n",
    "The easiest way to get these is to use the [conda](http://store.continuum.io/) environment manager.\n",
    "I suggest downloading and installing [miniconda](http://conda.pydata.org/miniconda.html).\n",
    "\n",
    "The following command will install all required packages:\n",
    "```\n",
    "$ conda install numpy scipy matplotlib scikit-learn ipython-notebook\n",
    "```\n",
    "\n",
    "Alternatively, you can download and install the (very large) Anaconda software distribution, found at https://store.continuum.io/."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Checking your installation\n",
    "\n",
    "You can run the following code to check the versions of the packages on your system:\n",
    "\n",
    "(in IPython notebook, press `shift` and `return` together to execute the contents of a cell)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from __future__ import print_function\n",
    "\n",
    "import IPython\n",
    "print('IPython:', IPython.__version__)\n",
    "\n",
    "import numpy\n",
    "print('numpy:', numpy.__version__)\n",
    "\n",
    "import scipy\n",
    "print('scipy:', scipy.__version__)\n",
    "\n",
    "import matplotlib\n",
    "print('matplotlib:', matplotlib.__version__)\n",
    "\n",
    "import sklearn\n",
    "print('scikit-learn:', sklearn.__version__)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Useful Resources"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- **scikit-learn:** http://scikit-learn.org (see especially the narrative documentation)\n",
    "- **matplotlib:** http://matplotlib.org (see especially the gallery section)\n",
    "- **Jupyter:** http://jupyter.org (also check out http://nbviewer.jupyter.org)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
